Scheduling Computationally Intensive Data Parallel Programs
Abstract
We consider the problem of how to run a workload of multiple parallel jobs on a single parallel machine. Jobs are assumed to be data-parallel with large degrees of parallelism, and the machine is assumed to have an MIMD architecture. We identify a spectrum of scheduling policies between the two extremes of time-slicing, in which jobs take turns to use the whole machine, and space-slicing, in which jobs get disjoint subsets of processors for their own dedicated use. Each of these scheduling policies is evaluated using a metric suited for interactive execution: the minimum machine power being devoted to any job, averaged over time. The following result is demonstrated. If there is no advance knowledge of job characteristics (such as running time, I/O frequency and communication locality), the best scheduling policy is gang scheduling with instruction balance. This conclusion validates some of the current practices in commercial systems. This work is then extended to irregular jobs, i.e., jobs in which the degree of parallelism varies during execution. It is shown that even though the scheduler may grant an irregular job substantial machine power, the job may not be able to harness that power efficiently unless it receives special run-time support in the form of load balancing.
Index Terms: parallel job scheduling, data-parallel, gang scheduling, local scheduling.
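The metric described above can be made concrete with a toy simulation. The sketch below is not from the paper; it assumes an idealized machine of `P` identical processors, perfectly divisible jobs, and zero context-switch or communication overhead, and it compares round-robin gang scheduling (time-slicing) against static partitioning (space-slicing) on the minimum time-averaged machine power any job receives.

```python
# Toy model (hypothetical, not the paper's simulator): compare time-slicing
# and space-slicing on the metric "minimum machine power devoted to any job,
# averaged over time". P processors, `jobs` concurrent jobs, T time slots.

def gang_metric(P, jobs, T):
    """Gang scheduling: each slot, one job gets the whole machine in turn.
    Returns the worst-off job's time-averaged processor allocation."""
    power = [0.0] * jobs
    for t in range(T):
        power[t % jobs] += P        # the running job gets all P processors
    avg_power = [p / T for p in power]
    return min(avg_power)

def space_metric(P, jobs):
    """Space-slicing: each job permanently owns an equal partition, so its
    instantaneous power equals its time-averaged power."""
    return P // jobs

print(gang_metric(64, 4, 100))      # 16.0
print(space_metric(64, 4))          # 16
```

In this idealized model the two policies tie on the metric; the paper's contribution is showing how real job characteristics (I/O, communication locality, irregular parallelism) break this tie in favor of gang scheduling with instruction balance.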
Similar Articles
Employing a study of the robustness metrics to assess the reliability of dynamic loop scheduling ∗
To achieve best performance, scientific applications are executed on parallel and distributed heterogeneous computing systems. These applications are often computationally intensive, data parallel, and irregular, and usually contain large loops that exhibit non-uniform characteristics depending upon their semantic structure during execution. These loops are the most data parallel and computationall...
Executing Communication-Intensive Irregular Programs Efficiently
We consider the problem of efficiently executing completely irregular, communication-intensive parallel programs. Completely irregular programs are those whose number of parallel threads as well as the amount of computation performed in each thread vary during execution. Our programs run on MIMD computers with some form of space-slicing (partitioning) and time-slicing (scheduling) support. A hard...
Task Scheduling Algorithm in GRID Considering Heterogeneous Environment
This paper deals with a new task scheduling algorithm for distributed heterogeneous computing environments. In distributed and parallel computing systems, efficient task scheduling of computationally intensive applications is one of the most essential and difficult issues. Although a large number of scheduling heuristics have been presented in the literature, most of them target only homogeneo...
Design and Implementation of the OpenMP Programming Interface on Linux-based SMP Clusters
Recently, cluster computing has successfully provided a cost-effective solution for data-intensive applications. In order to make programming on clusters easy, many programming toolkits such as MPICH, PVM, and DSM have been proposed in past research. However, these programming toolkits are not easy enough for common users to develop parallel applications. To address this problem, we have ...
Optimization of Decentralized Scheduling for Physic Applications in Grid Environments
This paper presents a scheduling framework that is configured for, and used in, physics systems. Our work addresses the problem of scheduling various computationally intensive and data-intensive applications that are required for extracting information from satellite images. The proposed solution allows mapping of image processing applications onto available resources. The scheduling is done at t...